ABSTRACT


The network design problem considered here involves mail which is
sorted progressively to a number of final destinations by traveling
through a network of sorting machines, each with one input channel and
a fixed (for all machines) number of output channels. Cost for sorting
is taken to be the sum of individual costs for sorting to each destination,
each of which is the product of the mail volume to that destination and
a "disutility" per unit of mail for the particular path in the network to
that destination's "sink." A simple algorithm and a dynamic program
provide complete specification of a network which minimizes cost for
certain types of disutility functions.




1.  PROBLEM FORMULATION*

This paper deals with a class of problems encountered in connection
with a study(1) of mechanical mail sorting methods. We begin by describing
these problems in quite general form, and then specialize to the particular
cases for which solution methods will be developed below.

The problems deal with the design of a network of devices to sort
material into a given number d of classes, e.g. to perform a "sort by
destination" on pieces of mail each addressed to one of d known cities or
regions. Problem data include nonnegative numbers $\{v_i\}_{i=1}^{d}$, where

$v_i$ = mean volume of class i material to be sorted per unit time.

The devices available for use in the network constitute a set
$\{M_j\}_{j=1}^{n}$ of single-input machines(2), each of which separates
material into a given number m of categories. A sorting network N with
parameters (n, m, d) is defined to be a network of nodes and directed arcs with

(a) a single origin node, with no incoming arc and one outgoing arc,

(b) n sorting nodes, each with one incoming arc and m outgoing arcs,
and each containing one of the machines $M_j$,

* I am grateful to colleagues A.J. Goldman, L.S. Joel, and J. Levy for
helpful discussions.

(1) Computer Symbolics, Inc., Task I Report (6/17/69) on The Operations
Research Analysis and Design of Maximally Advantageous Sorting Configurations,
under Post Office Department Contract No. RER 28-69.

(2) Here "machine" is used in the abstract or functional sense, without
reference to a specific technology; similarly, the "material" to be
sorted might consist (say) of information rather than physical objects.


(c) d destination nodes, each with one incoming arc and no outgoing
arcs, and a one-to-one association of these nodes with the d classes
of material to be separated, and

(d) the topological restriction(3) that there be exactly one directed
path in N from the origin node to each destination node, and hence
to each sorting node as well. (See Figure 1 for an example of a
sorting network.)

In such a sorting network N, let $P_i$ denote the unique path from the
origin node to the destination node associated with the i-th class of
material. Material in this class is to enter the network at the origin
node and (if no errors occur) flow along $P_i$ to the associated destination,
which should thus be reached only by class i material. The machine at a given
sorting node should receive material only of those classes i such that $P_i$
contains the sorting node; the machine in effect separates this family of
classes into m subfamilies for routing along its m outgoing arcs.

Before proceeding further, we recall (op. cit. in footnote 1) that the
three parameters (n, m, d) are not independent. To establish this, we count
in two ways the arcs of a sorting network: as "incoming arcs," so that
there is one arc per sorting node and one per destination node, and as
"outgoing arcs," so that there is one arc for the origin node and m for
each sorting node. Equating the two counts, we obtain the relation

$$n + d = 1 + nm,$$

or equivalently

$$d = 1 + n(m - 1). \tag{1.1}$$
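
Relation (1.1) is easy to check numerically. The following is a minimal Python sketch, not part of the original paper; the function names are illustrative. It computes d from (n, m) and, conversely, the number of machines needed for a given (m, d), padding with zero-volume dummy classes when d - 1 is not a multiple of m - 1.

```python
def num_destinations(n: int, m: int) -> int:
    """d = 1 + n(m - 1), relation (1.1)."""
    return 1 + n * (m - 1)


def machines_and_dummies(m: int, d: int) -> tuple[int, int]:
    """Smallest n with 1 + n(m - 1) >= d, and the number of dummy classes added."""
    if m < 2 or d < 1:
        raise ValueError("need m >= 2 and d >= 1")
    n = -(-(d - 1) // (m - 1))  # ceiling division
    return n, num_destinations(n, m) - d
```

For the network of Figure 1 (m = 2, n = 7), `num_destinations(7, 2)` gives the 8 destinations shown there.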

(3) In technical language, the restriction is that N be a "rooted tree."


Figure 1. Sorting network with m = 2, n = 7, d = 8.


The design of a sorting network, for given values (n, m)(4), can be
conceptually divided into three parts:

(a) specification of topology, subject to (d) above,

(b) associating the d destination nodes one-to-one with the d classes
of material, and

(c) associating the n sorting nodes one-to-one with the n machines
$\{M_j\}_{j=1}^{n}$, i.e. "assigning the machines to the sorting nodes."

For an optimal design problem, we must specify an "objective function"
to be maximized or minimized. This will first be done in a quite general
way to indicate a class of problems which may be of interest for future work.
We then specialize to the type of problems for which the methods of this
paper give a complete solution.

For any sequence (j(1), j(2), ..., j(L)) of distinct integers from the
set {1, 2, ..., n}, let

$g_i[j(1), \ldots, j(L)]$ = "score" per unit flow of class i material
through the sequence of machines $(M_{j(1)}, \ldots, M_{j(L)})$.

Furthermore, for a sorting network N, let

$S_i(N)$ = sequence of machines (specified by their indices) along the path $P_i$.

Then the function to be extremized by a proper choice of N is the mean
total score per unit time,

$$f(N) = \sum_{i=1}^{d} v_i \, g_i[S_i(N)]. \tag{1.2}$$

(4) Given m and d, the n-value determined from (1.1) might be non-integral;
in this case we can think of d as increased by the adjunction of enough
"dummy" classes with no members to raise n to the next higher integer value.

In considering how (1.2) might be specialized to tractable forms of
practical interest, it is useful to note three aspects of the generality of
the quantities $g_i[S_i(N)]$ appearing in (1.2). The first is the subscript
on $g_i$, which implies that different classes of material should receive
different scores for passage through a given sequence of machines. The
scores we have in mind refer to the time or cost of such a passage,
or the likelihood of damage or misrouting or of rejection as "unprocessable";
these are "disutilities" rather than "utilities," so
that we will be concerned with minimizing f(N) rather than maximizing it.
In this context, the subscript-dependence of $g_i$ might be plausible if we
were classifying objects by size or messages by length or pieces of mail
by the degree of machine-readability of their addresses. But our motivating
interest is the classification of mail by destination, which appears to
be at most coincidentally correlated with classification by score in the
sense indicated above. Thus we replace (1.2) by


$$f(N) = \sum_{i=1}^{d} v_i \, g[S_i(N)]. \tag{1.3}$$

Second, at present the score $g[S_i(N)]$ in (1.3) depends not only on
what set of machines is encountered in movement along path $P_i$, but also
on the order in which they are encountered. Such generality, while perhaps
needed for some applications, does not seem especially relevant for our
present purposes and will be dropped.

Third, no mention has been made of what properties of the machines
influence the scores. It will be assumed that the relevant properties
of $M_j$ can be represented by a single positive number

$a_j$ = indicator of disutility of passage through $M_j$.

As a consequence of these three steps, the scoring functions
$g_i[j(1), j(2), \ldots, j(L)]$ have been specialized to a function
$g[a_{j(1)}, a_{j(2)}, \ldots, a_{j(L)}]$, which is unchanged under
permutations of its positive-number arguments.

Before specializing further, three examples will be given to help
indicate the minimum degree of generality we want our final mathematical
model to possess. Suppose first that $a_j$ represents the time required for
an object to pass through $M_j$ and on to the next node, and that our objective
is to minimize total time spent in the network by all objects. Then

$$g[a_{j(1)}, a_{j(2)}, \ldots, a_{j(L)}] = a_{j(1)} + a_{j(2)} + \cdots + a_{j(L)},$$

or more compactly

$$g[S_i(N)] = \sum \{a_j : M_j \in S_i(N)\} \tag{1.4}$$

is the appropriate scoring function(5). Next, assume $a_j$ is the probability
that $M_j$ routes an object correctly (or, passes an object undamaged). If
the objective is to maximize the expected number of objects per unit time
which arrive at the correct destination node (or, which survive passage
through the network undamaged), then the appropriate scoring function is

$$g[a_{j(1)}, a_{j(2)}, \ldots, a_{j(L)}] = -a_{j(1)} \, a_{j(2)} \cdots a_{j(L)},$$

or more compactly

$$g[S_i(N)] = -\prod \{a_j : M_j \in S_i(N)\}, \tag{1.5}$$

the minus sign arising because our convention calls for a minimization
rather than a maximization. Third, the score may depend only on how many
machine-handlings an object receives in the network. If we define the
length of a path to be the number of sorting nodes it contains, and employ
the notation

$$L_i(N) = \text{length of } P_i, \tag{1.6}$$

then for an appropriate strictly monotone function(5a) h the scoring
function would be given by

$$g[S_i(N)] = h[L_i(N)]. \tag{1.7}$$

In this case the machines $M_j$ are in effect assumed identical.

(5) The special subcase with all $a_j = 1$ is the one solved in the cited
reference, footnote (1).
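
The three example scoring functions (1.4), (1.5) and (1.7), together with the objective (1.3), can be written out in a few lines. This is a minimal Python sketch; the function names and the representation of a path as a list of the $a_j$ values along it are illustrative assumptions, not notation from the paper.

```python
import math


def score_time(a_path):
    """(1.4): total transit time, the sum of a_j along the path."""
    return sum(a_path)


def score_success(a_path):
    """(1.5): minus the probability of correct routing (product of a_j)."""
    return -math.prod(a_path)


def score_length(a_path, h=lambda length: length):
    """(1.7): a monotone function h of the number of machines on the path."""
    return h(len(a_path))


def total_score(volumes, paths, g):
    """f(N) = sum_i v_i * g[S_i(N)], as in (1.3)."""
    return sum(v * g(p) for v, p in zip(volumes, paths))
```

All three scores are unchanged when the entries of `a_path` are permuted, which is the symmetry assumed of g in the text.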

To encompass these three cases while retaining the stipulations given
previously on g, and to give a basis for developing a solution algorithm,
we consider a continuous mathematical operation "*" which converts an
ordered pair (x, y) of numbers from some interval I of real numbers into a
number x*y of I. (Here the interval I can be finite or infinite, open or
half-open or closed.) The operation is assumed to satisfy the associative
law

$$(x*y)*z = x*(y*z)$$

(in algebra, the "semi-group" property), and to have the further property
that

$$x*y = x*z \ \text{ or } \ y*x = z*x \ \text{ implies } \ y = z.$$

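We then define the scoring function as

$$g[a_{j(1)}, a_{j(2)}, \ldots, a_{j(L)}] = h[a_{j(1)} * a_{j(2)} * \cdots * a_{j(L)}], \tag{1.8}$$

where h is continuous and strictly monotone. (With "*" as addition and
h(x) = x we get special case (1.4) or, if all $a_j = 1$, special case (1.7)(6).
With "*" as multiplication and h(x) = -x, we get (1.5).)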

(5a) Function h is called strictly monotone if either h(x) < h(y) whenever
x < y, or h(x) > h(y) whenever x < y.

(6) With h(x) ≡ x, this again reduces to the case treated in the cited
reference.

(7) See, for example, Aczel, J., Lectures on Functional Equations and their
Applications, Academic Press, 1966, pp. 253 ff.

Now we note that the above definition of "*" implies(7) that there exists a
continuous, strictly monotonically increasing function $\varphi$ defined on a
subset of the real numbers and with range I, such that(7a)

$$x*y = \varphi\big(\varphi^{-1}(x) + \varphi^{-1}(y)\big). \tag{1.9}$$

Thus the scoring function may also be written as(7b)

$$g[a_{j(1)}, \ldots, a_{j(L)}] = h\Big[\varphi\Big(\sum_{i=1}^{L} \varphi^{-1}(a_{j(i)})\Big)\Big] = h\varphi\Big[\sum_{i=1}^{L} \varphi^{-1}(a_{j(i)})\Big]. \tag{1.10}$$

Since h and $\varphi$ are strictly monotone, so is $\bar{h} = h\varphi$. Thus by
relabeling machine j with the quantity $\bar{a}_j = \varphi^{-1}(a_j)$, we may
assume that the function g has the form

$$g[\bar{a}_{j(1)}, \ldots, \bar{a}_{j(L)}] = \bar{h}\Big(\sum_{i=1}^{L} \bar{a}_{j(i)}\Big), \tag{1.11}$$

where $\bar{h}$ is strictly monotone.
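
As an illustration of this change of variables, take "*" to be multiplication on the positive reals, for which one admissible choice is $\varphi = \exp$ and $\varphi^{-1} = \log$. The minimal Python sketch below (function names are illustrative, and the choice of $\varphi$ is an assumption made for this example) checks that the success-probability score (1.5) agrees with its additive form (1.11), where $\bar{h}(s) = -e^{s}$ and $\bar{a}_j = \log a_j$.

```python
import math

phi = math.exp       # assumed choice of phi when "*" is multiplication
phi_inv = math.log


def star(x, y):
    """x * y written as phi(phi^{-1}(x) + phi^{-1}(y)), cf. (1.9)."""
    return phi(phi_inv(x) + phi_inv(y))


def g_original(a_path):
    """(1.5): h(x) = -x applied to the product of the a_j."""
    return -math.prod(a_path)


def g_transformed(a_path):
    """(1.11): h_bar(s) = h(phi(s)) = -exp(s) applied to the sum of a_bar_j."""
    a_bar = [phi_inv(a) for a in a_path]
    return -math.exp(sum(a_bar))
```

Since the $a_j$ are probabilities here, the transformed labels $\bar{a}_j = \log a_j$ are nonpositive; only their ordering matters in what follows.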

Although the problem has not been solved in this generality, a simple
algorithm and a dynamic program give the solution for two special cases
of interest, as detailed in Sections 2 and 3 respectively. The first two
examples given (cf. equations (1.4) and (1.5)) are solvable by the algorithm
of Section 2, while the third (1.7) is solvable by the dynamic program.
Section 4 discusses two specific examples of the type of problem treated in
Section 3 which appear to be of practical importance.


(7a) $\varphi^{-1}(x)$ is the unique number u such that $\varphi(u) = x$.

(7b) Here $h\varphi$ is defined by $h\varphi(u) = h[\varphi(u)]$.


2.  THE ALGORITHM


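We consider in this section the class of functions h given previously,
further restricted so that

$$h(x*a) - h(y*a) = \theta(a)\,[h(x) - h(y)], \tag{2.1}$$

where $\theta$ is a positive continuous function, monotone in the direction
opposite to that of h. Applying the transformation described in the previous
section, (2.1) becomes

$$h\varphi\big(\varphi^{-1}(x) + \varphi^{-1}(a)\big) - h\varphi\big(\varphi^{-1}(y) + \varphi^{-1}(a)\big) = \theta\varphi\big(\varphi^{-1}(a)\big)\,\big[h\varphi(\varphi^{-1}(x)) - h\varphi(\varphi^{-1}(y))\big],$$

or, with $\bar{h} = h\varphi$ and $\bar{\theta} = \theta\varphi$ and the arguments relabeled,

$$\bar{h}(x + a) - \bar{h}(y + a) = \bar{\theta}(a)\,[\bar{h}(x) - \bar{h}(y)], \tag{2.2}$$

where $\bar{\theta}$ is still monotone in the direction opposite to that of
$\bar{h}$. We use the functions $\bar{h}$ and $\bar{\theta}$ and the new labels
$\bar{a}_j$ in the discussion that follows.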

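First, however, we prove that equation (2.2) implies that $\bar{h}$ and
$\bar{\theta}$ have the form

$$\bar{\theta}(x) = e^{cx}, \quad -\infty < x < \infty, \tag{2.3}$$

and

$$\bar{h}(x) = k_1 e^{cx} + k_2 \quad (c \ne 0), \tag{2.4}$$

$$\bar{h}(x) = k_1 x + k_2 \quad (c = 0), \tag{2.5}$$

which implies that

$$\theta(x) = e^{c\psi(x)} \tag{2.6}$$

and

$$h(x) = k_1 e^{c\psi(x)} + k_2 \quad (c \ne 0), \tag{2.7}$$

$$h(x) = k_1 \psi(x) + k_2 \quad (c = 0), \tag{2.8}$$

where $\psi = \varphi^{-1}$. These relationships are shown as follows: First, it
is clear that $\bar{\theta}(ma) = (\bar{\theta}(a))^m$ for any positive integer m.
For,

$$\bar{\theta}(ma)\,[\bar{h}(x) - \bar{h}(y)] = \bar{h}(x + ma) - \bar{h}(y + ma) = \bar{\theta}((m-1)a)\,\bar{\theta}(a)\,[\bar{h}(x) - \bar{h}(y)] = [\bar{\theta}(a)]^m\,[\bar{h}(x) - \bar{h}(y)]$$

if $\bar{\theta}((m-1)a) = (\bar{\theta}(a))^{m-1}$. Thus a simple inductive
argument shows that $\bar{\theta}(ma) = [\bar{\theta}(a)]^m$, provided we take
$\bar{h}(x) \ne \bar{h}(y)$ in the above equation.

Therefore $\bar{\theta}(a/m) = [\bar{\theta}(a)]^{1/m}$ for each positive integer
m, so that if $\bar{\theta}(1) = e^{c}$ then $\bar{\theta}(j/m) = e^{cj/m}$. Thus
$\bar{\theta}(x) = e^{cx}$ holds for all positive rational numbers x = j/m; by
continuity, it holds for all $x \ge 0$.

Similarly, we have

$$\bar{\theta}(-ma)\,[\bar{h}(x) - \bar{h}(y)] = \bar{h}(x - ma) - \bar{h}(y - ma) = \bar{\theta}(-(m+1)a)\,\bar{\theta}(a)\,[\bar{h}(x) - \bar{h}(y)],$$

or

$$\bar{\theta}(-(m+1)a) = [\bar{\theta}(a)]^{-1}\,\bar{\theta}(-ma),$$

which leads to $\bar{\theta}(x) = e^{cx}$ for $x \le 0$, as above.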

To show that $\bar{h}$ is of the form (2.4) or (2.5), we note that if some
$\bar{h}$ satisfies (2.2), then so does $K_1\bar{h} + K_2$ for any constants
$K_1$ and $K_2$. Let $\bar{h}(x)$ be any function satisfying (2.2) for
$\bar{\theta}(x) = e^{cx}$, $c \ne 0$. Then, by the last remark, we may assume
that $\bar{h}(0) = 0$ and $\bar{h}(1) = e^{c} - 1$. Now for any x let k be
defined by

$$\bar{h}(x) = k(e^{cx} - 1).$$

By induction, it may be shown that

$$\bar{h}(mx) = k(e^{cmx} - 1)$$

for any positive integer m, since if this is true for m-1 we have, by (2.2),

$$\bar{h}(mx) - \bar{h}((m-1)x) = e^{c(m-1)x}\,[\bar{h}(x) - \bar{h}(0)],$$

or

$$\bar{h}(mx) = k(e^{c(m-1)x} - 1) + e^{c(m-1)x}\,[k(e^{cx} - 1)] = k(e^{cmx} - 1).$$

Letting x = 1/m, we get k = 1, so that

$$\bar{h}(1/m) = e^{c/m} - 1,$$

and by the previous argument,

$$\bar{h}(p/m) = e^{cp/m} - 1$$

for all positive integers p and m. Thus by the continuity of $\bar{h}$,

$$\bar{h}(x) = e^{cx} - 1$$

for all $x \ge 0$. This result is easily extended to negative numbers by
showing that

$$\bar{h}(-mx) = e^{-cmx} - 1$$

in a fashion similar to the above. Thus (2.4) is proved.
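
The functional equation (2.2) and the forms (2.3)-(2.5) can be spot-checked numerically. The sketch below is a minimal Python illustration, not part of the original argument; the helper names and the sample values are assumptions chosen for the check.

```python
import math


def make_h_theta(c, k1=1.0, k2=0.0):
    """Return (h_bar, theta_bar) of the form (2.3)-(2.5)."""
    if c == 0:
        return (lambda x: k1 * x + k2), (lambda a: 1.0)
    return (lambda x: k1 * math.exp(c * x) + k2), (lambda a: math.exp(c * a))


def satisfies_2_2(h, theta, x, y, a, tol=1e-9):
    """Check h(x+a) - h(y+a) == theta(a) * (h(x) - h(y)) up to rounding."""
    lhs = h(x + a) - h(y + a)
    rhs = theta(a) * (h(x) - h(y))
    return math.isclose(lhs, rhs, rel_tol=tol, abs_tol=tol)
```

A function outside this family, such as $x^2$ paired with $\bar{\theta} \equiv 1$, fails the same check, which is consistent with the uniqueness claim proved above.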

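If c = 0, it may be shown that setting $\bar{h}(0) = 0$ and $\bar{h}(1) = 1$
results in $\bar{h}(x) = x$, by an argument analogous to the above. Thus (2.5)
is also proved.

We now develop the basis for the algorithm. In what follows, the
scoring function will be given by

$$g[\bar{a}_{j(1)}, \ldots, \bar{a}_{j(L)}] = \bar{h}\Big(\sum_{i=1}^{L} \bar{a}_{j(i)}\Big),$$

where $\bar{h}$ satisfies (2.2) and $\bar{a}_{j(i)} = \varphi^{-1}(a_{j(i)})$. For
convenience, let the numbering of classes and machines be such that

$$v_1 \le v_2 \le \cdots \le v_d \tag{2.9}$$

and

$$\bar{a}_1 \le \bar{a}_2 \le \cdots \le \bar{a}_n \tag{2.10}$$

if $\bar{h}$ is increasing, while

$$\bar{a}_1 \ge \bar{a}_2 \ge \cdots \ge \bar{a}_n \tag{2.10'}$$

if $\bar{h}$ is decreasing.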


We also introduce the following two definitions. In any sorting network
N, a final sorting node is one whose outgoing arcs all terminate in
destination nodes, and a chain is a path from the origin node to a final
sorting node.

The following lemma is based solely on the numbering (2.9), (2.10)
or (2.10'), and the form of the function f(N).

LEMMA 1. There is an optimal network for which some final sorting node is
labeled n and the final destinations which it sorts to are labeled 1, 2, ...,
m.

PROOF. Let N be any optimal network. Choose a destination node corresponding
to some class i, such that $g[S_i(N)]$ is maximum. Let x be the final sorting
node preceding this destination node. If x does not sort to classes 1, 2, ...,
m, then there exist integers p and q, with $1 \le q \le m < p$, such that x sorts
to class p but not to class q. Consider the new network N' formed by
interchanging destination classes p and q in N. We have

$$f(N) - f(N') = \big(v_p\,g[S_p(N)] + v_q\,g[S_q(N)]\big) - \big(v_p\,g[S_q(N)] + v_q\,g[S_p(N)]\big)$$

$$= (v_p - v_q)\big(g[S_p(N)] - g[S_q(N)]\big)$$

$$= (v_p - v_q)\big(g[S_i(N)] - g[S_q(N)]\big) \ge 0$$

by (2.9) and the fact that $g[S_i(N)]$ is maximal. Thus interchanging classes
p and q in N does not destroy optimality. By an inductive process, we may
move all of the destinations 1, 2, ..., m to final sorting node x without
destroying optimality. Therefore, in the rest of the proof we assume that
x sorts to nodes 1, 2, ..., m.

Suppose that node x contains machine $M_k$, where $k \ne n$. Let

$$Q = \{i : n \in S_i(N)\}.$$

Then Q contains at least m members. Let R be any subset of Q of cardinality
m, and let distinct numbers $j(i) \in R$ be defined so that

$$v_i \le v_{j(i)}, \quad i = 1, 2, \ldots, m. \tag{2.11}$$

Clearly this is possible, by (2.9).

Consider the network N' formed by interchanging machines k and n in N.
In the network N let

$$u_r = \sum \{\bar{a}_s : s \in S_r(N),\ s \ne n\}$$

for $r \in Q$, and let

$$u = \sum \{\bar{a}_s : s \in S_1(N),\ s \ne k\}.$$

Then we have

$$f(N') - f(N) = [\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k)] \sum_{i=1}^{m} v_i + \sum_{r \in Q} v_r\,[\bar{h}(u_r + \bar{a}_k) - \bar{h}(u_r + \bar{a}_n)]. \tag{2.12}$$


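Now suppose that $\bar{h}$ is increasing and $\bar{\theta}$ decreasing in what
follows. (A similar argument applies in the opposite case.) Since
$\bar{a}_k \le \bar{a}_n$, by (2.10), we have

$$\bar{h}(u_r + \bar{a}_k) - \bar{h}(u_r + \bar{a}_n) \le 0$$

for each $r \in Q - R$, so that

$$f(N') - f(N) \le [\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k)] \sum_{i=1}^{m} v_i + \sum_{r \in R} v_r\,[\bar{h}(u_r + \bar{a}_k) - \bar{h}(u_r + \bar{a}_n)]$$

$$= \sum_{i=1}^{m} \big\{ v_i\,[\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k)] + v_{j(i)}\,[\bar{h}(u_{j(i)} + \bar{a}_k) - \bar{h}(u_{j(i)} + \bar{a}_n)] \big\}. \tag{2.13}$$

If $f(N') - f(N) \le 0$, then the lemma has been proved since N' is also
optimal. Thus it is sufficient to show, because of (2.13), that

$$v_i\,[\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k)] + v_{j(i)}\,[\bar{h}(u_{j(i)} + \bar{a}_k) - \bar{h}(u_{j(i)} + \bar{a}_n)] \le 0, \quad i = 1, 2, \ldots, m.$$

Because $\bar{a}_k \le \bar{a}_n$, $\bar{h}(u_{j(i)} + \bar{a}_k) \le \bar{h}(u_{j(i)} + \bar{a}_n)$,
as noted above. Furthermore, $v_{j(i)} \ge v_i$, so that

$$v_i\,[\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k)] + v_{j(i)}\,[\bar{h}(u_{j(i)} + \bar{a}_k) - \bar{h}(u_{j(i)} + \bar{a}_n)]$$

$$\le v_i\,[\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k)] + v_i\,[\bar{h}(u_{j(i)} + \bar{a}_k) - \bar{h}(u_{j(i)} + \bar{a}_n)]$$

$$= v_i\,[\bar{h}(u + \bar{a}_n) - \bar{h}(u + \bar{a}_k) + \bar{h}(u_{j(i)} + \bar{a}_k) - \bar{h}(u_{j(i)} + \bar{a}_n)]$$

$$= v_i\,\big\{\bar{\theta}(u)\,[\bar{h}(\bar{a}_n) - \bar{h}(\bar{a}_k)] - \bar{\theta}(u_{j(i)})\,[\bar{h}(\bar{a}_n) - \bar{h}(\bar{a}_k)]\big\}$$

$$= v_i\,[\bar{\theta}(u) - \bar{\theta}(u_{j(i)})]\,[\bar{h}(\bar{a}_n) - \bar{h}(\bar{a}_k)].$$

But $u + \bar{a}_k \ge u_{j(i)} + \bar{a}_n$ by the maximality of $g[S_i(N)]$, while
$\bar{a}_k \le \bar{a}_n$, implying

$$u \ge u_{j(i)}.$$

Thus

$$\bar{\theta}(u) \le \bar{\theta}(u_{j(i)}),$$

while

$$\bar{h}(\bar{a}_n) \ge \bar{h}(\bar{a}_k),$$

implying the desired inequality. This completes the proof of the lemma.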

We now state the special solution algorithm which applies when condition
(2.1) is satisfied. It is assumed that we are dealing with the transformed
functions $\bar{h}$ and $\bar{\theta}$, together with the transformed
$\bar{a}_j$'s, in this first version of the algorithm. Such transformation will
be shown to be unnecessary in the corollary to Theorem 1.

The algorithm is described (for fixed m) by recursion on the number
n of sorting nodes, with d given by (1.1). Since the solution for n = 1
is obvious, we give the recursion step from n-1 to n.

SPECIAL ALGORITHM. Number the classes and machines so that (2.9)
and (2.10) or (2.10') hold. Consider a new problem with the d-m original
classes m+1, m+2, ..., d plus a new class d+1 with

$$\bar{v}_{d+1} = \bar{\theta}(\bar{a}_n) \sum_{i=1}^{m} v_i, \tag{2.14}$$

and with machine-set $\{M_j\}_{j=1}^{n-1}$ and associated
$\{\bar{a}_j\}_{j=1}^{n-1}$. By the recursion hypothesis, an optimal network
$N^{o}_{n-1}$ for this new problem is available. In this network, "expand" the
destination node corresponding to class d+1 to a sorting node occupied by
$M_n$, with outgoing arcs terminating in destination nodes associated with
classes 1, 2, ..., m. This yields a sorting network $N^{o}_{n}$ for the
original problem; $N^{o}_{n}$ is an optimal solution to this problem.

Less formally, the algorithm's instructions are as follows. Choose
m classes with the smallest $v_i$'s, and a machine $M_j$ with the largest or
smallest $\bar{a}_j$ according as $\bar{h}$ is increasing or decreasing.
Destination nodes corresponding to the chosen classes will terminate arcs from
a final sorting node occupied by the chosen machine. Now apply the same
instruction to a situation involving all the remaining machines, and all the
remaining classes plus an artificial one whose "$v_i$" is $\bar{\theta}(\bar{a}_j)$
times the sum of the m smallest $v_i$'s; the node occupied by $M_j$ is then
identified with the "new" destination node corresponding to the artificial
class. Repeat the process (reducing the number of classes by m-1 each time)
until one class is left.
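
The informal description above can be sketched as an iterative merge on a priority queue. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the function names `build_network` and `network_cost` are hypothetical, the disutilities passed in are assumed to be already transformed, $\bar{\theta}(a) = e^{ca}$ as in (2.3), and the input is assumed to satisfy d = 1 + n(m - 1).

```python
import heapq
import itertools
import math


def build_network(volumes, a_bar, m, c=0.0, h_increasing=True):
    """Build a sorting tree by repeatedly merging the m smallest volumes.

    volumes      : v_i for the d classes
    a_bar        : transformed disutilities, one per machine
    m            : number of output channels per machine
    c            : exponent in theta_bar(a) = exp(c * a), cf. (2.3)
    h_increasing : True if h_bar is increasing, False if decreasing

    Returns a nested tuple: ("dest", i) or ("sort", j, [children]).
    """
    n, d = len(a_bar), len(volumes)
    if d != 1 + n * (m - 1):
        raise ValueError("parameters must satisfy d = 1 + n(m - 1)")
    # Machines are consumed from the largest a_bar (h_bar increasing)
    # or from the smallest a_bar (h_bar decreasing).
    order = sorted(range(n), key=lambda j: a_bar[j], reverse=h_increasing)
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(v, next(tie), ("dest", i)) for i, v in enumerate(volumes)]
    heapq.heapify(heap)
    for j in order:
        picked = [heapq.heappop(heap) for _ in range(m)]  # m smallest volumes
        merged_volume = math.exp(c * a_bar[j]) * sum(v for v, _, _ in picked)
        node = ("sort", j, [subtree for _, _, subtree in picked])
        heapq.heappush(heap, (merged_volume, next(tie), node))
    return heap[0][2]


def network_cost(tree, volumes, a_bar, h_bar, path_sum=0.0):
    """f(N) = sum_i v_i * h_bar(sum of a_bar along the path to class i)."""
    if tree[0] == "dest":
        return volumes[tree[1]] * h_bar(path_sum)
    _, j, children = tree
    return sum(
        network_cost(ch, volumes, a_bar, h_bar, path_sum + a_bar[j])
        for ch in children
    )
```

With c = 0 and all $\bar{a}_j = 1$ the procedure reduces to the familiar m-ary merging of the smallest weights; for instance, volumes (4, 1, 2, 1) with m = 2 give path lengths (1, 3, 2, 3) and a total cost of 14 under $\bar{h}(x) = x$.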
THEOREM 1. The special algorithm produces an optimal sorting network
when h satisfies (2.1).


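PROOF. The proof is by induction on n, where d is given by (1.1).
Let N be any sorting network for the n-node problem, and let x be any
final sorting node with disutility $\bar{a}$ which sorts to destinations j(1),
j(2), ..., j(m). Let N' be another sorting network with n-1 sorting nodes
derived from N by replacing destinations j(1), j(2), ..., j(m) with a
destination d+1 having volume $\bar{\theta}(\bar{a}) \sum_{i=1}^{m} v_{j(i)}$ and
replacing node x with a destination node labeled d+1. Let u denote the sum of
the disutilities of the sorting nodes preceding x on the path to x. Then it is
clear that if N and N' have the same scoring function $\bar{h}$, we have

$$f(N) - f(N') = \Big(\sum_{i=1}^{m} v_{j(i)}\Big)\,\bar{h}(u + \bar{a}) - \bar{\theta}(\bar{a})\Big(\sum_{i=1}^{m} v_{j(i)}\Big)\,\bar{h}(u)$$

$$= \Big(\sum_{i=1}^{m} v_{j(i)}\Big)[\bar{h}(u + \bar{a}) - \bar{h}(\bar{a})] - \bar{\theta}(\bar{a})\Big(\sum_{i=1}^{m} v_{j(i)}\Big)[\bar{h}(u) - \bar{h}(0)] + \bar{h}(\bar{a}) \sum_{i=1}^{m} v_{j(i)} - \bar{\theta}(\bar{a})\,\bar{h}(0) \sum_{i=1}^{m} v_{j(i)}$$

$$= [\bar{h}(\bar{a}) - \bar{\theta}(\bar{a})\,\bar{h}(0)] \sum_{i=1}^{m} v_{j(i)}, \quad \text{by (2.2).}$$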

i=l 

Now let $N_n$ be any optimal network on the original n sorting nodes. By
Lemma 1, we may assume that there is a final sorting node in $N_n$ labeled n
which sorts to destinations 1 through m. Let $N_{n-1}$ be the network on n-1
sorting nodes, with destinations m+1, m+2, ..., d, d+1 and sorting
disutilities $\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_{n-1}$, derived from $N_n$ by
replacing sorting node n and its destination nodes by a destination node d+1,
where

$$\bar{v}_{d+1} = \bar{\theta}(\bar{a}_n) \sum_{i=1}^{m} v_i.$$

By the above remarks, we have

$$f(N_n) - f(N_{n-1}) = [\bar{h}(\bar{a}_n) - \bar{\theta}(\bar{a}_n)\,\bar{h}(0)] \sum_{i=1}^{m} v_i. \tag{2.15}$$

But if $N^{o}_{n-1}$ and $N^{o}_{n}$ are, respectively, the networks produced by
the algorithm operating on the reduced set of machines and destinations and
on the original set, then it is also clear that

$$f(N^{o}_{n}) - f(N^{o}_{n-1}) = [\bar{h}(\bar{a}_n) - \bar{\theta}(\bar{a}_n)\,\bar{h}(0)] \sum_{i=1}^{m} v_i, \tag{2.16}$$

from the construction given in the algorithm. But since $N^{o}_{n-1}$ is
optimal, by the induction assumption, it follows that
$f(N^{o}_{n-1}) \le f(N_{n-1})$, and hence from (2.15) and (2.16) that
$f(N^{o}_{n}) \le f(N_n)$. Thus $N^{o}_{n}$ is optimal and the theorem is proven.

Next we show that the conversion function $\varphi$ of (1.9) is really
unnecessary.

COROLLARY. In the algorithm, let the function $\theta$ be substituted for
$\bar{\theta}$ and the original values of the $a_j$ be substituted for their
transformed values. Then the algorithm still gives the optimal solution.

PROOF. Clearly

$$\bar{\theta}(\bar{a}_j) = \theta\varphi(\varphi^{-1}(a_j)) = \theta(a_j), \tag{2.17}$$

while

$$\bar{a}_j \le \bar{a}_k \ \text{ if and only if } \ a_j \le a_k,$$

and $\bar{\theta}$ is monotone increasing (decreasing) if and only if $\theta$ is
monotone increasing (decreasing), by the fact that $\varphi$ is monotone
increasing. Thus the ordering of subscripts for the a's is not changed, so that
$M_n$ is used in the recursion step of the algorithm, while by (2.17) the
$\bar{v}_{d+1}$ defined by either method is the same. Thus each application of
the algorithm gives the same result under the original as under the
transformed system, and so the final network is the same.


3.  A DYNAMIC PROGRAM


In this section we obtain a dynamic programming formulation for the
problem of minimizing

$$f(N) = \sum_{i=1}^{d} v_i \, h[L_i(N)], \tag{3.1}$$

where as before $L_i(N)$ is the number of sorting nodes, in N, in the path
from the origin node to the destination node associated with class i.

Note  that  the  quantities  appear  explicitly  in  this  problem. 

The function h is assumed monotone increasing. Two examples will be
discussed in Section 4.

As in Section 2, we adopt the numbering convention

$$v_1 \le v_2 \le \cdots \le v_d. \tag{3.2}$$

Using a standard theorem(8) on "rearrangements," we find that there must
be an optimal N with

$$L_1(N) \ge L_2(N) \ge \cdots \ge L_d(N). \tag{3.3}$$
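
The rearrangement step can be illustrated on a small case: for a fixed multiset of path lengths and an increasing h, pairing the volumes in ascending order with the lengths in descending order attains the minimum of (3.1) over all assignments. The Python sketch below is illustrative only; the function names are assumptions, and the brute-force comparison is practical only for small d.

```python
from itertools import permutations


def total_cost(volumes, lengths, h):
    """f(N) = sum_i v_i * h(L_i), as in (3.1)."""
    return sum(v * h(L) for v, L in zip(volumes, lengths))


def best_assignment(volumes, lengths, h):
    """Pair ascending volumes with descending lengths, as in (3.2)-(3.3)."""
    v_sorted = sorted(volumes)
    l_sorted = sorted(lengths, reverse=True)
    return total_cost(v_sorted, l_sorted, h)


def brute_force(volumes, lengths, h):
    """Minimum of f over all assignments of the given lengths to the classes."""
    return min(total_cost(volumes, perm, h) for perm in permutations(lengths))
```

For volumes (5, 1, 2), lengths (1, 2, 2) and h(L) = L, both functions return 11, with the largest volume assigned the shortest path.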


Define N to be an M-stage network if it satisfies (3.3) and has

$$L_1(N) = \max_i L_i(N) = M. \tag{3.4}$$

This clearly requires $M \le n$.

(8) See Chapter 10 of Inequalities, Hardy, Littlewood and Polya, Cambridge
University Press (1952).

In terms of the problem data $\{v_i\}_{i=1}^{d}$, define an M-stage network N'
to be of type (s,z) if it satisfies (3.3), has s sorting nodes ($s \le n$) and
therefore

$$D(s) = s(m-1) + 1 = d - (n-s)(m-1)$$

destination nodes, and has D(s) classes with data
$\{v'_{d+1-k}\}_{k=1}^{D(s)}$ such that

$$v'_{d+1-k} = v_{d+1-k} \quad \text{for } 1 \le k \le D(s) - z, \tag{3.5}$$

$$v'_{d+1-k} = 0 \quad \text{for } D(s) - z < k \le D(s), \tag{3.6}$$

$$D(s) - z < p \le D(s) \ \text{ implies } \ L_{d+1-p}(N') = M. \tag{3.7}$$

The original problem is to find an optimal network of type (n,0). For
this it suffices to find, for each $M \le n$, an M-stage network which is optimal
in the class of M-stage type (n,0) networks. To employ a dynamic programming
approach, we imbed this in the following class of problems: For each
triple (M,s,z) with $M \le s \le n$ and $z \le D(s)$ for which an M-stage type
(s,z) network exists, find an optimal M-stage network of type (s,z). Let

$f_M(s,z)$ = minimum value of f(N') over all M-stage networks N' of
type (s,z).

Clearly $f_1$ is defined only for s = 1 and $z \le m$, and is given by

$$f_1(1,z) = \Big(\sum_{k=1}^{m-z} v_{d+1-k}\Big)\,h(1).$$

To complete the dynamic programming formulation, we develop a recursion
expressing $f_{M+1}$ in terms of $f_M$.
Let $N_{M+1}$ be an optimal (M+1)-stage type (s,z) network. In $N_{M+1}$
consider all final nodes of chains of length M+1, and all destination nodes
which follow them; by (3.6)-(3.7), these include at least the destination
nodes associated with the first z classes of $N_{M+1}$, that is, the classes
labeled D(s)-z+1 through D(s), whose volume is 0 as stated in (3.6).
Suppose these final sorting nodes are q in number, so that $qm \ge z$. Collapse
each one of these sorting nodes, together with the m destination nodes
which follow it, to a single new destination node associated with a new
class having "v' = 0". This yields an M-stage network $N'_M$ of type (s-q,q),
such that

$$f(N_{M+1}) = f(N'_M) + h(M+1) \sum_{k=D(s)-qm+1}^{D(s)-z} v_{d+1-k}.$$

Moreover the construction is reversible, in the following sense:
Given any M-stage type (s-q,q) network $N_M$ and a value of z with $z \le qm$ and
$z \le D(s)$, the destination nodes of $N_M$ corresponding to its first q classes
can each be "expanded" to a sorting node followed by m destination nodes,
in such a way that an (M+1)-stage type (s,z) network results. From these
considerations we obtain the desired recursion,

$$f_{M+1}(s,z) = \min_{q} \Big\{ f_M(s-q,\,q) + h(M+1) \sum_{k=D(s)-qm+1}^{D(s)-z} v_{d+1-k} \Big\}, \tag{3.8}$$

where the range for q can be limited by the conditions $z \le qm \le D(s)$.


4.  EXAMPLES FOR SECTION 3


Two practical problems which are amenable to the dynamic program
solution of Section 3 are as follows. In the first instance, it is desired
to minimize a cost of sorting, which is composed of a cost for each unit of
material sorted by each machine plus a cost for material which is lost (or
misdirected) at a given sorting. If we assume that a fraction $q \le 1$ of the
input to a sort remains after sorting, that the cost per unit per sort is
a, and that the cost for a unit lost in the sorting process is L, then the
function h of Section 3 is given by

$$h(M) = a(1 + q + \cdots + q^{M-1}) + L(1 - q^{M}),$$

and so

$$h(M+1) - h(M) = a q^{M} + L(q^{M} - q^{M+1}) = q^{M}\,(a + L(1 - q)) > 0.$$

Thus h is monotonic, and the method of Section 3 applies.
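
The increment formula above can be confirmed numerically. The following minimal Python sketch is illustrative (function names and sample parameter values are assumptions); it evaluates h(M) from its definition and compares successive differences with the closed form.

```python
import math


def h_loss(M, a, L, q):
    """h(M) = a(1 + q + ... + q^(M-1)) + L(1 - q^M)."""
    return a * sum(q**j for j in range(M)) + L * (1 - q**M)


def increment(M, a, L, q):
    """Closed form of h(M+1) - h(M) = q^M (a + L(1 - q))."""
    return q**M * (a + L * (1 - q))
```

The increment is positive for any a > 0, L >= 0 and 0 < q <= 1, so h is increasing in M as required by Section 3.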

The second problem is as follows. Part of the material being sorted
is lost at each sort, but everything lost in the process of sorting is
reinserted at the initial node for another sort, and this process is
repeated until everything has come through the sorting network. This
might apply, for instance, to the sorting of mail. The objective is to
minimize the total cost of sorting, where the cost per sort per machine is
a and a fraction q of the input to a sorting node comes through without loss.
For a single sort, if an amount $v_i$ is inserted at the origin bound
for destination i, and if the length of the path is M, then the cost
is

$$v_i\,a\,(1 + q + \cdots + q^{M-1}).$$

However, since only $q^{M} v_i$ comes through the process, an amount $p v_i$,
where $p = 1 - q^{M}$, is lost and must be put through again. On the third
time, $p^{2} v_i$ is inserted, and so on. The total cost for an initial amount
$v_i$ is therefore

$$(1 + p + p^{2} + \cdots)\,v_i\,a\,(1 + q + \cdots + q^{M-1}) = \Big(\frac{1}{1-p}\Big)\,v_i\,a\,\Big(\frac{1 - q^{M}}{1 - q}\Big) = v_i\,a \sum_{j=1}^{M} q^{-j}.$$

Thus

$$h(M) = a \sum_{j=1}^{M} q^{-j}$$

and

$$h(M+1) - h(M) = a\,q^{-M-1} > 0,$$

proving monotonicity of h, so that Section 3 applies.
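
The closed form for the resorting cost can be compared against a truncated version of the geometric series it was derived from. This is a minimal Python sketch with illustrative names; the truncation length `rounds` is an arbitrary assumption, chosen large enough that the neglected tail is negligible for the sample values.

```python
import math


def h_resort(M, a, q):
    """h(M) = a * sum_{j=1}^{M} q^(-j)."""
    return a * sum(q ** (-j) for j in range(1, M + 1))


def h_resort_series(M, a, q, rounds=2000):
    """Truncated series (1 + p + p^2 + ...) * a * (1 + q + ... + q^(M-1))."""
    p = 1 - q**M  # fraction lost on one pass through a path of length M
    per_pass = a * sum(q**j for j in range(M))
    return per_pass * sum(p**k for k in range(rounds))
```

For M = 3, a = 2 and q = 0.9 both expressions evaluate to approximately 7.435 per unit of volume.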